
LoongArch64 LSX fast-path for str.contains(&str) #144393


Merged: 1 commit merged into rust-lang:master from the str-contains-lsx branch on Jul 29, 2025


heiher
Contributor

@heiher heiher commented Jul 24, 2025

Benchmark results with LLVM 21 on LA664:

OLD:
test bench_is_contained_in ... bench:          43.63 ns/iter (+/- 0.04)

NEW:
test bench_is_contained_in ... bench:          12.81 ns/iter (+/- 0.01)
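
For readers who want to put the two figures side by side, the sketch below computes the ratio of the quoted timings (roughly 3.4x) and exercises the public `str::contains(&str)` call named in the PR title. The haystack and needle are hypothetical; the PR text does not show the inputs used by `bench_is_contained_in`, and the helper `speedup` is introduced here only for illustration.

```rust
/// Ratio of the old time per iteration to the new one.
/// `speedup` is a hypothetical helper, not part of the PR.
fn speedup(old_ns: f64, new_ns: f64) -> f64 {
    old_ns / new_ns
}

fn main() {
    // Hypothetical inputs: the benchmark's real haystack and needle
    // are not shown in the PR description.
    let haystack = "The quick brown fox jumps over the lazy dog";
    let needle = "lazy";

    // `str::contains` with a `&str` pattern is the call the PR title targets.
    assert!(haystack.contains(needle));
    assert!(!haystack.contains("cat"));

    // ns/iter figures quoted in the PR description (LLVM 21 on LA664).
    let ratio = speedup(43.63, 12.81);
    println!("speedup: {ratio:.2}x"); // prints "speedup: 3.41x"
}
```

The ratio only restates the two reported numbers; it says nothing about other needle lengths, haystack sizes, or hardware.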

@rustbot
Collaborator

rustbot commented Jul 24, 2025

r? @tgross35

rustbot has assigned @tgross35.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 24, 2025
@heiher heiher force-pushed the str-contains-lsx branch from 1e2fd99 to 1ceacf5 Compare July 29, 2025 13:37
@heiher
Contributor Author

heiher commented Jul 29, 2025

@bors r=tgross35

@bors
Collaborator

bors commented Jul 29, 2025

@heiher: 🔑 Insufficient privileges: Not in reviewers

@tgross35
Contributor

Thanks!

@bors r+

@bors
Collaborator

bors commented Jul 29, 2025

📌 Commit 1ceacf5 has been approved by tgross35

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 29, 2025
@bors
Collaborator

bors commented Jul 29, 2025

⌛ Testing commit 1ceacf5 with merge ba7e63b...

@bors
Collaborator

bors commented Jul 29, 2025

☀️ Test successful - checks-actions
Approved by: tgross35
Pushing ba7e63b to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 29, 2025
@bors bors merged commit ba7e63b into rust-lang:master Jul 29, 2025
11 checks passed
@rustbot rustbot added this to the 1.90.0 milestone Jul 29, 2025
Contributor

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 686bc1c (parent) -> ba7e63b (this PR)

Test differences


2 doctest diffs were found. These are ignored, as they are noisy.

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard ba7e63b63871a429533c189adbfb1d9a6337e000 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. x86_64-apple-1: 6947.6s -> 10441.2s (50.3%)
  2. dist-x86_64-apple: 7315.4s -> 10255.6s (40.2%)
  3. dist-aarch64-apple: 7478.4s -> 5085.3s (-32.0%)
  4. aarch64-msvc-1: 6960.6s -> 9107.5s (30.8%)
  5. x86_64-gnu-distcheck: 8550.9s -> 7795.3s (-8.8%)
  6. aarch64-apple: 4886.7s -> 5221.5s (6.9%)
  7. tidy: 107.5s -> 114.4s (6.4%)
  8. dist-riscv64-linux: 4780.5s -> 5080.3s (6.3%)
  9. x86_64-apple-2: 3684.0s -> 3471.8s (-5.8%)
  10. dist-aarch64-linux: 5635.6s -> 5899.6s (4.7%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Collaborator

Finished benchmarking commit (ba7e63b): comparison URL.

Overall result: ❌ regressions - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

  • If the regression was expected or you think it can be justified,
    please write a comment with sufficient written justification, and add
    @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
  • If you think that you know of a way to resolve the regression, try to create
    a new PR with a fix for the regression.
  • If you do not understand the regression or you think that it is just noise,
    you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
    were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|:--|:--|:--|:--|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.9% | [0.9%, 1.0%] | 6 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results (primary 5.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|:--|:--|:--|:--|
| Regressions ❌ (primary) | 5.6% | [5.6%, 5.6%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 5.6% | [5.6%, 5.6%] | 1 |

Cycles

Results (secondary -2.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|:--|:--|:--|:--|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.7% | [-3.1%, -2.3%] | 2 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 469.682s -> 469.063s (-0.13%)
Artifact size: 376.80 MiB -> 376.81 MiB (0.00%)

@rustbot rustbot added the perf-regression Performance regression. label Jul 30, 2025
@heiher heiher deleted the str-contains-lsx branch July 30, 2025 05:05
@lqd
Member

lqd commented Jul 30, 2025

match-stress noise

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Jul 30, 2025
github-actions bot pushed a commit to model-checking/verify-rust-std that referenced this pull request Jul 30, 2025
LoongArch64 LSX fast-path for `str.contains(&str)`
Labels

  • merged-by-bors: This PR was explicitly merged by bors.
  • perf-regression: Performance regression.
  • perf-regression-triaged: The performance regression has been triaged.
  • S-waiting-on-bors: Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
  • T-libs: Relevant to the library team, which will review and decide on the PR/issue.